Without measurement errors in predictors, discontinuities of a nonparametric
regression function at unknown locations can be estimated using a number
of existing approaches. However, estimation becomes a challenging problem
when the predictors contain measurement errors.

In this paper, an error-in-variables jump point estimator is suggested 
for a nonparametric generalized error-in-variables regression model. A 
major feature of our method is that it does not impose any parametric 
distribution on the measurement error. Its performance is evaluated 
by both numerical studies and theoretical justifications. The method 
is applied to studying the impact of Medicare Levy Surcharge on the 
private health insurance take-up rate in Australia. 


1. Introduction. This paper is motivated by our attempt to study the 
impact of the Medicare Levy Surcharge (MLS) tax policy on the take-up rate
of private health insurance (PHI) in Australia. People in Australia are
liable for the MLS (which is about 1 percent of their annual taxable incomes) if
they do not buy PHI and their annual taxable incomes are above a certain 
level. For example, the income threshold for single individuals was $50,000 
per annum in the 2003-2004 financial year, where the dollar sign “$” used 
here and throughout the paper represents the Australian Dollar (AUD). The 
major purposes of this tax policy were to give people more choices of health 
insurance and to take a certain pressure off the public medical system. It 
was expected that this policy would generate a jump in the PHI take-up 
rate around the taxable income threshold. The size of the jump could be 
used to evaluate the impact of the policy on the PHI take-up rate. However,
the jump location may not be exactly at the threshold for the reasons given
below. First, the $500 MLS at the threshold could be lower than the net 
cost of PHI, and taxpayers usually consider buying PHI only when the MLS 
exceeds the cost of PHI at higher income levels. Second, the MLS is collected 
when taxpayers file their tax returns after the financial year is finished, while
the decision to buy the PHI should be made before the financial year starts. 
Because it is difficult for people to predict their taxable incomes accurately, 
they may not be aware of the MLS issue until it occurs. So, the jump location 
is actually unknown and needs to be estimated properly before the jump size 
can be estimated. To estimate the jump location accurately is also helpful 
for understanding the demand for PHI in Australia. This is because from 
the costs of the PHI and the difference between the estimated jump point 
and the threshold, one can infer the true value of PHI to the taxpayers. 

There is a large literature using tax changes as a source of variation in
the after-tax price of health insurance. Most of these studies concern
employer-provided health insurance in the US. See Gruber and Poterba (1994),
Finkelstein (2002), Rodriguez and Stoyanova (2004), and Buchmueller,
DiNardo and Valletta (2011) for a few examples. It is rarely the case that the
tax changes can be argued to be exogenous. Jumps caused by policy design,
such as the MLS in Australia, have been argued to be exogenous locally for 
the individuals around it [Lee and Lemieux (2010)]. 

In the statistical literature, jump detection in regression functions has 
been discussed by several authors, including Joo and Qiu (2009), Müller
(1992, 2002), Qiu (1991, 1994), Qiu and Yandell (1998), Wu and Chu (1993), 
and the references therein. See Qiu (2005) for an overview on this topic. All 
existing jump detection methods assume that the explanatory variable does 
not have any measurement error involved. Meanwhile, the existing literature 
on the error-in-variables regression modeling assumes that the measurement 
error distribution is known or can be estimated reasonably well beforehand,
and that the related regression function is continuous. See, for example,
Carroll et al. (2006), Carroll, Maca and Ruppert (1999), Comte and Taupin 
(2007), Cook and Stefanski (1994), Delaigle and Meister (2007), Delaigle 
(2008), Fan and Masry (1992), Fan and Truong (1993), Hall and Meister 
(2007), Liang and Wang (2005), Staudenmayer and Ruppert (2004), Stefanski
(2000), Stefanski and Cook (1995), and Taupin (2001). Our case is
much more complicated. The data available to us are drawn from the “1%
Sample Unit Record File of Individual Income Tax Returns” for the 2003-2004
financial year, which was developed by the Australian Taxation Office (ATO)
for research purposes. Out of privacy considerations, the ATO intentionally
perturbed the income data by multiplying them by random numbers, and the
true distribution of the random numbers has not been revealed. Our
major task here is to estimate a jump point and the jump magnitude in
a regression model when the regressor contains measurement errors with
an unknown distribution. This problem is much more challenging to handle
than the ones discussed in the papers mentioned above. For instance,
the deconvolution kernel regression estimator proposed by Fan and Truong
(1993) assumes that the characteristic function of the measurement error
distribution is completely known. Such detailed knowledge of the measurement
error distribution is unavailable in the current PHI problem. Also, Meister
(2009) pointed out that, when the error distribution is misspecified in
conventional deconvolution problems, the Mean Integrated Squared Error
(MISE) of the deconvolution kernel estimator is not bounded from above.
Therefore, this estimator can perform badly when the error distribution is 
not correctly specified. 

In this paper, we propose a generalized error-in-variables regression model 
for describing the relationship between the PHI take-up rate and a person’s 
annual taxable income. In the model, a jump point is included to accommodate
the possible abrupt impact of the MLS on the PHI take-up rate. A novel
jump detector is proposed as well, which takes into account the measurement
errors. One feature of our method is that it does not require the measurement
error distribution to be specified beforehand, making it applicable to
the current PHI problem and other real problems. 

The remainder of the article is organized as follows. In Section 2, our 
proposed model and jump detector are described in detail. In Section 3, 
some statistical properties of the proposed jump detector are discussed. In 
Section 4, its numerical performance is evaluated. In Section 5, an in-depth 
analysis of the PHI data is presented. Several remarks conclude the article 
in Section 6. Some technical details are provided in a supplementary file. 


2. Proposed methodology. Let $\{(W_i, Y_i) : i = 1, \ldots, n\}$ be a sample of $n$
independent and identically distributed (i.i.d.) observations from the models
described below:


(i) The conditional distribution of $Y_i \mid X_i = x$ has a probability density
function (p.d.f.) or probability mass function (p.m.f.) from the exponential family

(2.1)   $\exp\left\{ \frac{y\theta(x) - b(\theta(x))}{a(\phi)} + c(y, \phi) \right\}$,

where $X_i$ is the $i$th observation of the unobservable explanatory variable $X$,
$Y_i$ is the $i$th observation of the response variable $Y$, $\theta(x)$ is the canonical
parameter when $X_i = x$, $\phi$ is a scale parameter, and $a(\phi)$, $b(\theta(x))$, and
$c(y, \phi)$ are certain functions of $\phi$, $\theta(x)$, and $(y, \phi)$, respectively.

(ii) $W_i$ is the observed value of $X_i$ with a measurement error, and their
relationship can be described by the model

(2.2)   $W_i = X_i + \sigma_n U_i$,

where $\sigma_n > 0$ denotes the standard deviation of the measurement error in
$X_i$, and $U_i$ is the standardized measurement error with mean 0 and variance
1. It is also assumed that the $U_i$'s are i.i.d., that $U_i$ is independent of both
$X_i$ and $Y_i$, and that the distribution of $U_i$, denoted as $f_U$, and the
distribution of $X_i$, denoted as $f_X$, are both unknown.

In model (2.1), $\theta(x)$ relates $Y_i$ to $X_i$. Model (2.1) includes many
commonly used distributions (e.g., Normal, Poisson, Binomial) as special
cases. In the current PHI problem, because the ATO perturbed the income data
by multiplying each original income observation by a random number, the
income variables are used in log scale so that model (2.2) with additive
measurement error is appropriate. More specifically, $X_i$ and $W_i$ denote the
true and observed annual taxable incomes in log scale, respectively, $Y_i$
denotes the status of PHI take-up ($Y_i$ equals 1 when a specific person buys
the PHI and 0 otherwise), and $Y_i \mid X_i = x$ is assumed to follow the Bernoulli
distribution with the probability of success being $p(x) = P(Y_i = 1 \mid X_i = x)$.
In such cases, the quantities $\theta(x)$, $a(\phi)$, $b(\theta(x))$, and $c(y, \phi)$ can be
specified as follows:

$\theta(x) = \log\{p(x)/(1 - p(x))\}$,  $a(\phi) = 1$,  $b(\theta) = \log(1 + \exp(\theta))$,  $c(y, \phi) = 0$.
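
The log-scale argument above can be illustrated with a short sketch. The incomes and the perturbation range below are made-up values used only for illustration; the actual perturbation scheme used by the ATO is not known. The sketch shows that a multiplicative perturbation of income becomes an additive error, as in model (2.2), once incomes are taken in log scale.

```python
import numpy as np

def log_scale_error(income, factor):
    """Return (X, W, W - X) in log scale for a multiplicative perturbation."""
    x = np.log(income)            # true log income X_i
    w = np.log(income * factor)   # observed log income W_i
    return x, w, w - x            # the additive error equals log(factor)

rng = np.random.default_rng(0)
income = rng.uniform(20_000, 90_000, size=5)   # hypothetical incomes (AUD)
factor = rng.uniform(0.9, 1.1, size=5)         # hypothetical perturbation factors
x, w, err = log_scale_error(income, factor)
```

Here `err` plays the role of $\sigma_n U_i$ in (2.2); its distribution is determined by the distribution of the multiplicative factor, which remains unknown in the PHI data.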


As discussed in Section 1, the MLS tax policy is expected to generate a jump
in $p(x)$. So, a jump in $\theta(x)$ is expected as well.

In the PHI problem, it is important to estimate the jump position in $\theta(x)$
in order to study the impact of the MLS tax policy on the PHI take-up. Our
proposed jump detector is described below. The true jump position of $\theta(x)$
is assumed to be at $s$, which is unknown. Without loss of generality, let us
assume that the support of $f_X$ is $[0,1]$ and $s \in (0,1)$. Let $m(x)$ denote the
conditional mean of $Y$ given $X = x$. Then, it can be checked from (2.1) that

$m(x) = E(Y \mid X = x) = b'(\theta(x))$.
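
As a quick numerical check of the identity $m(x) = b'(\theta(x))$ in the Bernoulli case, the sketch below assumes the standard logit form $\theta = \log\{p/(1-p)\}$ and $b(\theta) = \log(1 + \exp(\theta))$. The probability values are arbitrary illustrative choices.

```python
import numpy as np

def b(theta):
    """Cumulant function of the Bernoulli family."""
    return np.log1p(np.exp(theta))

def b_prime(theta):
    """Analytic derivative of b, which is the logistic function."""
    return 1.0 / (1.0 + np.exp(-theta))

p = np.array([0.1, 0.35, 0.5, 0.8])     # illustrative success probabilities
theta = np.log(p / (1 - p))             # canonical (logit) parameter
eps = 1e-6
numeric = (b(theta + eps) - b(theta - eps)) / (2 * eps)   # central difference
```

Both `b_prime(theta)` and the finite-difference value `numeric` recover `p`, so a jump in $\theta(x)$ corresponds to a jump in $m(x) = p(x)$ and vice versa.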


For any given point $x \in (2h_n, 1 - 2h_n)$, let us consider its right-sided
neighborhood $[x, x + h_n]$, where $h_n > 0$ is a bandwidth parameter. When there is
no measurement error in $X$, $m(x+) = \lim_{\Delta x \to 0+} m(x + \Delta x)$ can be estimated
reasonably well [cf. Qiu (2005), Chapter 2] by

(2.3)   $\frac{\sum_{i=1}^n Y_i K_r\left(\frac{X_i - x}{h_n}\right)}{\sum_{i=1}^n K_r\left(\frac{X_i - x}{h_n}\right)}$,

where $K_r$ is a decreasing kernel function with the right-sided support $(0,1]$.

Fig. 1. The solid line denotes the conditional mean function $m(x) = E(Y \mid X = x)$ that has
a jump at $s = 0.5$ (marked by the vertical dashed line). The dark points denote observations
of $(W, Y)$, where $W$ is the observed value of $X$ with measurement error involved. It can be
seen that the values of $X$ corresponding to those $W$ values that are close to the true jump
location (e.g., those falling between the two vertical dotted lines) are likely to be on both
sides of the jump.

In the case when $X$ has measurement errors involved, the estimator in (2.3)
is unavailable because the $X_i$'s are no longer observable. It may be problematic
if we simply replace the $X_i$'s by the $W_i$'s in (2.3) because we do not know whether
a specific $X_i$ is located on the right-hand side of $x$ or not when its observed
value $W_i$ is on the right-hand side of $x$, due to the measurement error.
However, as demonstrated in Figure 1, the following fact can be observed:
if $W_i$ is close to the true jump point $s$, then the corresponding unobservable
$X_i$ is likely to be on the other side of the jump location and, consequently,
$Y_i$ follows a distribution with the parameter $\theta(X_i)$, which could be very
different from $\theta(W_i)$. A one-sided kernel estimator defined in (2.3) with the
$X_i$'s replaced by the corresponding $W_i$'s actually averages observations on
both sides of the jump location. Thus, the impact of the measurement error
could be severe in such cases. On the other hand, in the case when $W_i$ is
farther away from $s$, such an impact becomes smaller. Based on this fact, let
us consider a one-step-right neighborhood of $x$, defined to be
$N_{n,r}(x; h_n) := (x + h_n, x + 2h_n)$, and define


(2.4)   $\hat m_{n,r}(x+) = \frac{\sum_{i=1}^n Y_i K_r\left(\frac{W_i - (x + h_n)}{h_n}\right)}{\sum_{i=1}^n K_r\left(\frac{W_i - (x + h_n)}{h_n}\right)}$,

where the kernel function $K_r$ is the same as the one in (2.3). If $x$ is at the
jump location, then $\hat m_{n,r}(x+)$ should be a weighted average of observations
that are on the right-hand side of $x$ since the observations in the neighborhood
$N_{n,r}(x; h_n)$ are all quite far away from the jump point. This would
limit the possibility of averaging observations on both sides of $x$, and thus
diminish the impact of the measurement error. However, this estimator may
still have some bias for estimating $m(x+)$ because the $X$ values of most
observations used in $\hat m_{n,r}(x+)$ are at least one bandwidth above $x$. To address
this issue, let us define



(2.5)   $\hat m_n(x+) = \frac{\sum_{i=1}^n Y_i K_r\left(\frac{W_i - x}{h_n}\right) K^*\left(\frac{|\hat m_n^*(W_i+) - \hat m_{n,r}(x+)|}{\rho_n}\right)}{\sum_{i=1}^n K_r\left(\frac{W_i - x}{h_n}\right) K^*\left(\frac{|\hat m_n^*(W_i+) - \hat m_{n,r}(x+)|}{\rho_n}\right)}$,

where the kernel function $K_r$ is the same as the one in (2.3), $K^*$ is another
decreasing kernel function with support $[0,1]$,
$\rho_n = \max_{x < W_i < x + h_n} |\hat m_n^*(W_i+) - \hat m_{n,r}(x+)|$, and

$\hat m_n^*(x+) = \frac{\sum_{i=1}^n Y_i K_r\left(\frac{W_i - x}{h_n}\right)}{\sum_{i=1}^n K_r\left(\frac{W_i - x}{h_n}\right)}$

is the conventional one-sided kernel estimator of $m(x+)$. The intuitive
explanation of (2.5) is as follows. From its definition, $\hat m_n^*(W_i+)$ is mainly
determined by observations close to $W_i$ and these observations are mostly
in the neighborhood $[x, x + h_n]$. For some of these observations, the
corresponding $X_i$'s may be on the left-hand side of $x$ and, thus, the impact of
the measurement error on the one-sided estimator $\hat m_n^*(W_i+)$ in (2.5) could be severe.
On the other hand, the measurement error does not have much impact on
$\hat m_{n,r}(x+)$, as explained above. The bias in $\hat m_{n,r}(x+)$ for estimating $m(x+)$,
due to the fact that it uses many observations that are at least one bandwidth
on the right-hand side of $x$, is considered significantly smaller than
the bias in $\hat m_n^*(W_i+)$ due to measurement error, because the former is due
to the variation of $m(\cdot)$ in a small continuity region while the latter is caused
by the jump. So, the difference $\hat m_n^*(W_i+) - \hat m_{n,r}(x+)$ can provide us a
measure of the impact of the measurement error in $W_i$ on estimation of $m(x+)$.
If the difference is small, then the impact of the measurement error in $W_i$
should be small. Otherwise, its impact should be large. The kernel function
$K^*$ in (2.5) aims to eliminate such an impact. Therefore, $\hat m_n(x+)$ should
provide a reasonable estimator for $m(x+)$. An estimator of $m(x-)$ can be
constructed similarly to (2.5), which is denoted as $\hat m_n(x-)$. Then, the true
jump location $s$ can be estimated by

(2.6)   $\hat s_n = \arg\max_{x \in (2h_n, 1 - 2h_n)} |\hat m_n(x+) - \hat m_n(x-)|$,

and the corresponding jump magnitude $d$ in $m(x)$ can be estimated by

(2.7)   $\hat d_n = \hat m_n(\hat s_n+) - \hat m_n(\hat s_n-)$.

It should be pointed out that, although the exponential family in (2.1) is
parameterized using the canonical parameter $\theta(x)$, the mean parameter $m(x)$
is often easier to interpret in practice. For this reason, both the jump
location and the jump magnitude are discussed above in terms of $m(x)$, instead
of $\theta(x)$. In Section 3, we will show that under certain regularity conditions,
$\hat s_n$ is a consistent estimator.
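
The steps in (2.4)-(2.7) can be sketched in NumPy as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the weight $K^*(|\hat m_n^*(W_i\pm) - \hat m_{n,r}(x\pm)|/\rho_n)$ is taken as a reading of (2.5); both kernels are chosen as $1 - u^2$ on $[0,1]$; the maximization in (2.6) is carried out over a finite grid; and the simulated data (jump of size 0.4 at $s = 0.5$, error standard deviation 0.02) are hypothetical. All function names are illustrative.

```python
import numpy as np

def epan(u):
    """Epanechnikov-type weight on [0, 1]; decreasing there, zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where((u >= 0) & (u <= 1), 1.0 - u**2, 0.0)

def wavg(y, wts):
    """Weighted average of y; NaN when all weights vanish."""
    tot = wts.sum()
    return float(np.dot(wts, y) / tot) if tot > 0 else np.nan

def one_sided(w, y, x, h, side, shift=0.0):
    """Kernel average of Y over the window of width h that starts `shift`
    away from x, on the right (side=+1) or on the left (side=-1)."""
    u = (side * (w - x) - shift) / h
    return wavg(y, epan(u))

def weighted_side(w, y, local, x, h, side):
    """Estimator in the spirit of (2.5) for m(x+) (side=+1) or m(x-) (side=-1)."""
    far = one_sided(w, y, x, h, side, shift=h)     # one-step-away average, cf. (2.4)
    u = side * (w - x) / h
    near = (u > 0) & (u < 1)                       # observations with W_i next to x
    if not near.any() or np.isnan(far):
        return np.nan
    diff = np.abs(local[near] - far)               # |m*_n(W_i) - far average|
    rho = diff.max()
    k_star = epan(diff / rho) if rho > 0 else np.ones_like(diff)
    return wavg(y[near], epan(u[near]) * k_star)

def detect_jump(w, y, h, grid):
    """Return (s_hat, d_hat) in the spirit of (2.6)-(2.7) over candidate points."""
    loc_r = np.array([one_sided(w, y, wi, h, +1) for wi in w])   # m*_n(W_i+)
    loc_l = np.array([one_sided(w, y, wi, h, -1) for wi in w])   # m*_n(W_i-)
    gaps = np.array([weighted_side(w, y, loc_r, x, h, +1)
                     - weighted_side(w, y, loc_l, x, h, -1) for x in grid])
    k = int(np.nanargmax(np.abs(gaps)))
    return grid[k], gaps[k]

# Hypothetical data: Bernoulli response with a jump of size 0.4 at s = 0.5.
rng = np.random.default_rng(1)
n, h = 4000, 0.1
x_true = rng.uniform(0, 1, n)
w_obs = x_true + 0.02 * rng.standard_normal(n)
y_obs = rng.binomial(1, np.where(x_true < 0.5, 0.3, 0.7)).astype(float)
grid = np.linspace(2 * h, 1 - 2 * h, 61)
s_hat, d_hat = detect_jump(w_obs, y_obs, h, grid)
```

The one-sided averages $\hat m_n^*(W_i\pm)$ are computed once for all observations and reused at every grid point, which keeps the cost at roughly $O(n^2)$ plus $O(n)$ per candidate location.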

In the proposed jump detector (2.6), there is one parameter $h_n$ to choose.
According to Gijbels and Goderniaux (2004), jump detectors based on kernel
smoothing in cases without measurement error depend heavily on the
choice of bandwidth parameters. In simulation studies, the true jump location
$s$ is known. Then, $h_n$ can be chosen to be the one that minimizes
$|\hat s_n(h_n) - s|$, where $\hat s_n$ has been written as $\hat s_n(h_n)$ for convenience of
discussion. In practice, $s$ is usually unknown. In such cases, we suggest the
following bootstrap bandwidth selection procedure:

• For a given bandwidth value $h_n > 0$, apply the proposed jump detection
procedure (2.4)-(2.6) to the original data set $\{(W_1, Y_1), (W_2, Y_2), \ldots,
(W_n, Y_n)\}$, and obtain an estimator of $s$, denoted as $\hat s_n(h_n)$.

• Draw with replacement $n$ times from the original data set to obtain
the first bootstrap sample, denoted as $\{(W_1^{(1)}, Y_1^{(1)}), \ldots, (W_n^{(1)}, Y_n^{(1)})\}$.

• Apply the proposed jump detection procedure (2.4)-(2.6) to the first
bootstrap sample, and obtain the first bootstrap estimator of $s$, denoted as
$\hat s_n^{(1)}(h_n)$.

• Repeat the previous two steps $B$ times and obtain $B$ bootstrap estimators
of $s$: $\{\hat s_n^{(1)}(h_n), \hat s_n^{(2)}(h_n), \ldots, \hat s_n^{(B)}(h_n)\}$.

• Then, the bandwidth $h_n$ is chosen to be the solution of

(2.8)   $\min_{h_n} \frac{1}{B} \sum_{k=1}^{B} |\hat s_n(h_n) - \hat s_n^{(k)}(h_n)|$.
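
The bootstrap procedure above can be sketched as follows. To keep the example short and self-contained, the jump detector is passed in as a function, and a simple difference-of-local-means detector is used as a stand-in; it is not the proposed detector (2.4)-(2.6). The candidate bandwidths, the value of `B`, and the simulated data are hypothetical.

```python
import numpy as np

def simple_detector(w, y, h):
    """Stand-in detector: maximize |right mean - left mean| over a grid."""
    grid = np.linspace(2 * h, 1 - 2 * h, 41)
    gaps = []
    for x in grid:
        right = y[(w > x) & (w <= x + h)]
        left = y[(w < x) & (w >= x - h)]
        gaps.append(abs(right.mean() - left.mean()) if right.size and left.size else 0.0)
    return grid[int(np.argmax(gaps))]

def bootstrap_bandwidth(w, y, bandwidths, detector, B=30, seed=0):
    """Pick the bandwidth minimizing the criterion in (2.8)."""
    rng = np.random.default_rng(seed)
    n = len(w)
    scores = []
    for h in bandwidths:
        s_full = detector(w, y, h)                 # estimate from the original sample
        dev = []
        for _ in range(B):
            idx = rng.integers(0, n, size=n)       # draw n pairs with replacement
            dev.append(abs(s_full - detector(w[idx], y[idx], h)))
        scores.append(np.mean(dev))                # (1/B) * sum_k |s_hat - s_hat^(k)|
    scores = np.array(scores)
    return bandwidths[int(np.argmin(scores))], scores

# Hypothetical data with a jump at 0.5.
rng = np.random.default_rng(2)
x_true = rng.uniform(0, 1, 400)
w_obs = x_true + 0.03 * rng.standard_normal(400)
y_obs = rng.binomial(1, np.where(x_true < 0.5, 0.3, 0.7)).astype(float)
bandwidths = [0.05, 0.10, 0.15, 0.20]
h_sel, scores = bootstrap_bandwidth(w_obs, y_obs, bandwidths, simple_detector)
```

Replacing `simple_detector` with an implementation of (2.4)-(2.6) gives the procedure described in the text; the resampling of $(W_i, Y_i)$ pairs is unchanged.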

It should be pointed out that, as a byproduct of the above bootstrap
bandwidth selection procedure, a confidence interval for $s$ can be constructed
from the empirical distribution of $\{\hat s_n^{(1)}(\hat h_n), \hat s_n^{(2)}(\hat h_n), \ldots, \hat s_n^{(B)}(\hat h_n)\}$, where
$\hat h_n$ denotes the bandwidth selected by the bootstrap. More specifically, for a
given significance level $\alpha \in (0,1)$, a $100(1 - \alpha)\%$ confidence interval for $s$ is
defined to be $(\hat s_{n,\alpha/2}(\hat h_n), \hat s_{n,1-\alpha/2}(\hat h_n))$, where $\hat s_{n,\alpha/2}(\hat h_n)$ and
$\hat s_{n,1-\alpha/2}(\hat h_n)$ denote the $(\alpha/2)$th and $(1 - \alpha/2)$th quantiles of the empirical
distribution of $\{\hat s_n^{(1)}(\hat h_n), \hat s_n^{(2)}(\hat h_n), \ldots, \hat s_n^{(B)}(\hat h_n)\}$.
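
The percentile interval described above amounts to taking two empirical quantiles of the bootstrap estimates. In the sketch below, the bootstrap estimates are simulated placeholder values rather than output from the PHI data, and the quantile interpolation rule is NumPy's default, which is an assumption.

```python
import numpy as np

def percentile_interval(boot_estimates, alpha=0.05):
    """Return the (alpha/2, 1 - alpha/2) empirical quantiles of the estimates."""
    lo, hi = np.quantile(boot_estimates, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

rng = np.random.default_rng(3)
boot = rng.normal(0.5, 0.02, size=999)   # placeholder bootstrap estimates of s
lo, hi = percentile_interval(boot, alpha=0.05)
```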

3. Statistical properties. In this section, we discuss some statistical
properties of the proposed jump detector defined in (2.6). To this end, we have
the theorem below.

Theorem 1. Assume that $\{(W_1, Y_1), (W_2, Y_2), \ldots, (W_n, Y_n)\}$ are i.i.d.
observations from models (2.1) and (2.2), and the following conditions are
satisfied:

(1) $\theta(\cdot)$ is a bounded, piecewise continuous function with a single jump
at $s$ and its first-order derivative is also a bounded function,

(2) $a(\cdot)$, $b(\cdot)$, $b'(\cdot)$, and $b''(\cdot)$ are all bounded and continuous functions,

(3) $(b')^{-1}(\cdot)$ exists and it is strictly monotone and Lipschitz-1 continuous$^1$
in any compact subset of the range of $\theta(\cdot)$,

(4) the support of $f_X$ is $[0,1]$ and $s \in (0,1)$,

(5) $f_X$ is continuous, bounded, and positive on $(0,1)$,

(6) $f_U$ is a continuous function and has a positive value at 0,

(7) the kernel functions $K^*$ and $K_r$ are Lipschitz-1 continuous density
functions with the same support $[0,1]$,

(8) the bandwidth h^ satisfies the conditions that h^ = o(l), and ilogn)^^^/ 
{n^/'^hn) = o(l), for some 1 ] > 1/2. 

Then, we have the following results: 


(i) If $\sigma_n/h_n = o(1)$, then

$\hat m_n(x+) - \hat m_n(x-) = m(x+) - m(x-) + O\left((\sigma_n/h_n)^{2/3}\right) + O(\beta_n)$ a.s.,


where j3n = hn+ ■ 

' fin 


(ii) If $\lim_{n \to \infty} \sigma_n/h_n = C$, for some $C > 0$, then

$\hat m_n(x+) - \hat m_n(x-) = m(x+) - m(x-) - (m(x+) - m(x-))\, C_{K,r} + O(\beta_n)$ a.s.,

where

$C_{K,r} = \frac{\int_0^1 K_r^{**}(w) P(|U| > w/C)\, dw}{\int_0^1 K_r^{**}(w)\, dw}$

and

$K_r^{**}(w) = K_r(w)\, K^*\left( \frac{\int_0^1 K_r(v) P(v + w < CU < v + 1)\, dv}{\int_0^1 K_r(v) P(v < CU < v + 1)\, dv} \right)$.

(iii) If the conditions in either (i) or (ii) hold, then we have
$|\hat s_n - s| = O(h_n)$, a.s.

$^1$Given an interval $I \subseteq \mathbb{R}$, a function $g : I \to \mathbb{R}$ is called Lipschitz-1 continuous if there
exists a real constant $C > 0$ such that, for all $x_1$ and $x_2 \in I$, $|g(x_1) - g(x_2)| \le C|x_1 - x_2|$.

Theorem 1 shows that the jump detector defined in (2.6) provides a
statistically consistent estimator of $s$ under some regularity conditions. Its proof
is given in a supplementary file [Kang et al. (2015)]. In result (ii) of the
above theorem, if $K^*$ and $K_r$ are both decreasing on $[0,1]$, then we have

$\frac{\int_0^1 K_r^{**}(w) P(|U| > w/C)\, dw}{\int_0^1 K_r^{**}(w)\, dw} \le \frac{\int_0^1 K_r(w) P(|U| > w/C)\, dw}{\int_0^1 K_r(w)\, dw}$.

It can be checked that if the conventional kernel estimators are used when
defining the jump detection criterion [i.e., $\hat m_n^*(x+)$ is used], then the
asymptotic bias for $\hat m_n^*(x+) - \hat m_n^*(x-)$ to estimate $m(x+) - m(x-)$ is
$(m(x+) - m(x-)) \frac{\int_0^1 K_r(w) P(|U| > w/C)\, dw}{\int_0^1 K_r(w)\, dw}$, which is larger than the asymptotic bias
$(m(x+) - m(x-)) \frac{\int_0^1 K_r^{**}(w) P(|U| > w/C)\, dw}{\int_0^1 K_r^{**}(w)\, dw}$ when we use $\hat m_n(x+) - \hat m_n(x-)$ to
estimate $m(x+) - m(x-)$. Therefore, the second kernel function $K^*$ used in
(2.5) is helpful in reducing the asymptotic bias. In Theorem 1, it is required
that the measurement error variance $\sigma_n^2$ tends to 0 when the sample size $n$
increases. In the literature, it has been pointed out that this condition is 
needed for consistently estimating the regression function when its obser¬ 
vations have measurement errors involved and when little prior information 
about the measurement error distribution is available [cf. Delaigle (2008)]. 
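
The comparison of the two asymptotic bias factors can be checked numerically. The sketch below rests on assumptions that are not part of the theorem: $U$ is standard Normal, both $K_r$ and $K^*$ are taken as $1 - u^2$ on $[0,1]$, $C = 0.5$, and $K_r^{**}(w)$ is computed as $K_r(w)$ times $K^*$ evaluated at the ratio of the two integrals appearing in result (ii). Integrals are approximated by the trapezoidal rule on a grid.

```python
import math
import numpy as np

def trapezoid(f, x):
    """Composite trapezoidal rule on a grid."""
    return float(np.sum((f[1:] + f[:-1]) * np.diff(x)) / 2.0)

def kernel(u):
    """Decreasing kernel 1 - u^2 on [0, 1], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where((u >= 0) & (u <= 1), 1.0 - u**2, 0.0)

# Standard Normal c.d.f. built from math.erf (assumed error distribution).
norm_cdf = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def bias_factors(C, m=201):
    grid = np.linspace(0.0, 1.0, m)
    k_r = kernel(grid)
    tail = 2.0 * (1.0 - norm_cdf(grid / C))                 # P(|U| > w/C)
    band = lambda a, b: norm_cdf(b / C) - norm_cdf(a / C)   # P(a < C U < b)
    num = np.array([trapezoid(k_r * band(grid + w, grid + 1.0), grid) for w in grid])
    ratio = num / num[0]                                    # argument of K^*
    k_rr = k_r * kernel(ratio)                              # K_r^{**}(w)
    conventional = trapezoid(k_r * tail, grid) / trapezoid(k_r, grid)
    proposed = trapezoid(k_rr * tail, grid) / trapezoid(k_rr, grid)
    return conventional, proposed

conv, prop = bias_factors(C=0.5)
```

Because the ratio inside $K^*$ decreases from 1 at $w = 0$ to 0 at $w = 1$, the factor $K^*(\cdot)$ puts less weight on small $w$, where $P(|U| > w/C)$ is largest. This is why `prop` comes out smaller than `conv`.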


4. Numerical studies. In this section, we present some results regarding
the numerical performance of the proposed jump detector described
in Section 2, which are organized in two subsections. Section 4.1 includes
some simulation examples related to the jump detector defined in (2.6).
Section 4.2 compares the proposed jump detector to the
difference-kernel-estimation (DKE) procedure that ignores the measurement error [cf. Qiu
(2005), Section 3.2].


4.1. Numerical performance of the proposed jump detector. In this
subsection, the performance of the proposed jump detector is evaluated using
two simulated examples. In each example, we consider cases when the sample
size $n$ equals 100 or 200, $f_X$ is Unif[0,1], and $f_U$ is either a Normal,
a Laplace, or a Uniform distribution, with $E(U) = 0$ and $\mathrm{Var}(U)/\mathrm{Var}(X)$
fixed at 15%. In each combination of $n$ and $f_U$, the simulation is repeated
100 times. For each given bandwidth $h_n$, 100 values of the Absolute Error
(AE), defined as $\mathrm{AE}(h_n) = |\hat s_n(h_n) - s|$, are computed. Their average is
called the Mean Absolute Error (MAE) and is denoted as $\mathrm{MAE}(h_n)$. The
minimizer of $\mathrm{MAE}(h_n)$ is called the optimal bandwidth and is denoted as
$h_{opt}$. In each replicated simulation, we also compute a bandwidth value using
the proposed bootstrap procedure. The average of such 100 bandwidth
values is called the bootstrap bandwidth, denoted as $h_{bt}$. In each example,
the values of $h_{opt}$, $h_{bt}$, $\mathrm{MAE}(h_{opt})$, $\mathrm{MAE}(h_{bt})$, and the empirical coverage
probability (CP) of the 95% confidence interval (see its description at the
end of Section 2) computed from the 100 replicated simulations are
presented. In the case when $n = 100$ and $f_U$ is Normal, the sample that gives
the median value of $\mathrm{AE}(h_{opt})$ is denoted as $S_{50}$. For that sample, the
estimated jump location by (2.6) with $h_n = h_{bt}$ and the corresponding 95%
confidence interval for $s$ will be presented. Throughout this section, if there
is no further specification, the bootstrap sample size $B$ is chosen to be 999,
and $K^*$ and $K_r$ used in (2.4)-(2.6) are both chosen to be the Epanechnikov
kernel function.
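
The simulation design can be sketched as below. One assumption is made explicit: the 15% ratio is interpreted as the variance of the measurement error term relative to $\mathrm{Var}(X) = 1/12$ for $X \sim$ Unif[0,1], with $U$ standardized to mean 0 and variance 1 as in model (2.2). The function names and the sample size used in the check are illustrative.

```python
import numpy as np

def standardized_error(dist, size, rng):
    """Draw U with mean 0 and variance 1 from the named distribution."""
    if dist == "normal":
        return rng.standard_normal(size)
    if dist == "laplace":
        return rng.laplace(0.0, 1.0 / np.sqrt(2.0), size)      # variance 2 * scale^2 = 1
    if dist == "uniform":
        return rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size)  # variance (2*sqrt(3))^2 / 12 = 1
    raise ValueError(f"unknown distribution: {dist}")

def simulate_w(n, dist, ratio=0.15, seed=0):
    """Return (X, W, sigma) with Var(sigma * U) / Var(X) equal to `ratio`."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, n)
    sigma = np.sqrt(ratio / 12.0)          # Var(X) = 1/12 for Unif[0,1]
    u = standardized_error(dist, n, rng)
    return x, x + sigma * u, sigma

ratios = {}
for dist in ("normal", "laplace", "uniform"):
    x, w, sigma = simulate_w(200_000, dist, seed=1)
    ratios[dist] = np.var(w - x) * 12.0    # empirical error-to-signal variance ratio
```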

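In the first example, the conditional distribution of $Y \mid X = x$ is assumed
to follow the Normal distribution with density

$\frac{1}{\sqrt{2\pi \times 0.01^2}} \exp\left\{ -\frac{(y - \theta(x))^2}{2 \times 0.01^2} \right\}$,

where

$\theta(x) = \begin{cases} -\sin(2\pi x), & \text{if } x < 0.5; \\ -\sin(2\pi x) + 1, & \text{otherwise}. \end{cases}$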


Figure 2 shows a realization of the sample $S_{50}$ with the true function $\theta(\cdot)$ (solid
line), the estimated jump location $\hat s_n$ (vertical dashed line), and a 95%
confidence interval for the true jump location $s$ (vertical dotted lines). It can be
seen that the proposed jump detector estimates the true jump location
reasonably well. The numerical performance of the jump detector (2.6) based
on 100 replicated simulations is summarized in Table 1. From the table, it
can be seen that (i) the proposed jump detector estimates the true jump
location reasonably well for various error distributions, (ii) the performance
of $\hat s_n$ improves as the sample size $n$ increases, (iii) the bootstrap bandwidths
are slightly larger than the optimal bandwidths but they are quite close to
each other, and (iv) the empirical coverage probabilities of the proposed
confidence interval for $s$ are all close to the nominal coverage probability 0.95.

Fig. 2. A realization of $S_{50}$ in the first example with the true function $\theta(\cdot)$ (solid line),
the estimated jump location $\hat s_n$ (vertical dashed line), and a 95% confidence interval for
the true jump location $s$ (vertical dotted lines).

Table 1
Numerical summary of the first simulation example based on 100 replicated simulations

n     f_U       h_opt    h_bt     MAE(h_opt)   MAE(h_bt)   CP
100   Normal    0.3000   0.3008   0.0290       0.0293      0.95
      Laplace   0.2931   0.2985   0.0292       0.0308      0.98
      Uniform   0.2767   0.2910   0.0335       0.0347      0.96
200   Normal    0.2991   0.3074   0.0232       0.0245      0.96
      Laplace   0.2902   0.3002   0.0191       0.0195      0.94
      Uniform   0.2721   0.2917   0.0232       0.0239      0.97

Next, we discuss the second simulation example, whose setting is made
similar to that of the PHI data. Assume that the conditional distribution of
$Y \mid X = x$ is Bernoulli with the probability of success being


f I — if X G (0,0.5]; 

\o.5(l —x)^, if xG (0.5,1). 


Figure 3 shows a realization of $S_{50}$ with the true function $p(\cdot)$ (solid
line), the estimated jump location $\hat s_n$ (vertical dashed line), and the 95%
confidence interval for $s$ (vertical dotted lines). It can be seen from the figure
that the sample $S_{50}$ has quite severe measurement errors involved and that
the proposed jump detector gives a reasonably good estimate of the true
jump location. The numerical performance of the jump detector (2.6) based
on 100 replicated simulations is summarized in Table 2. From the table, it
can be seen that similar conclusions to those in the first example can be
made here.


4.2. Comparison to the DKE estimator. The DKE procedure [see Section 3.2
in Qiu (2005) for a detailed discussion] provides a good estimator
of the true jump position when there is no measurement error involved. In 
this subsection, we compare our proposed jump detector (2.6) with the DKE 
procedure in an artificial example with a setup similar to that of the PHI data.

Fig. 3. A realization of $S_{50}$ in the second example with the true function $p(\cdot)$ (solid
line), the estimated jump location $\hat s_n$ (vertical dashed line), and the 95% confidence interval
for $s$ (vertical dotted lines).


The proposed jump detector is denoted as NEW and the DKE procedure 
is denoted as DKE. Assume that the conditional distribution of $Y \mid X = x$ is
Bernoulli with the probability of success being 

r||x^ + 0.15, if xG [0,0.6); 

^ ^ ~ I 12 (x - 0 . 8)3 Q 590 ^ if ^ g f]_ 

It can be seen that $p(x)$ is a piecewise polynomial with a jump of size 0.1
at $x = 0.6$, as plotted in Figure 4. In the figure, the estimated function of
$p(\cdot)$ by the local linear kernel (LLK) smoothing procedure is also shown.


Table 2
Numerical summary of the second simulation example based on 100 replicated simulations

n     f_U       h_opt    h_bt     MAE(h_opt)   MAE(h_bt)   CP
100   Normal    0.3329   0.3448   0.0407       0.0444      0.93
      Laplace   0.2758   0.2826   0.0397       0.0422      0.98
      Uniform   0.3203   0.3247   0.0479       0.0484      0.97
200   Normal    0.3122   0.3100   0.0363       0.0369      0.98
      Laplace   0.2820   0.2816   0.0326       0.0336      0.92
      Uniform   0.2878   0.2983   0.0352       0.0352      0.94

Fig. 4. The dashed line denotes the true curve of $p(\cdot)$, the solid line denotes the local
linear kernel (LLK) estimate of $p(\cdot)$ using one realization of simulated data when $f_X(\cdot)$
is Unif[0,1], the vertical dot-dashed line denotes the estimated jump location $\hat s_n$, and the
vertical dotted lines denote the 95% confidence interval for $s$.

From the plot, it can be seen that the shape of $p(\cdot)$ is similar to that in the
PHI study, which is shown in Figure 5. Note that the jump size in the PHI
application is estimated to be 0.19 (cf. Section 5), while the jump size in this
example is about half of that size. To give this numerical study a setup similar
to that of the PHI study, which has 9685 observations distributed in
the interval [9.8,11.3], we choose the sample size $n$ in the current example to
be 6000 and the observations are in the design interval [0,1]. In addition, $f_U$
is chosen to be $N(0, 0.05^2)$, and $f_X$ is either Unif[0,1], Beta(2,2), Beta(3,2),
or Beta(2,3). In each case, the simulation is repeated 100 times, the optimal
bandwidth is selected based on the 100 replicated simulations, and the mean
and standard deviation of the 100 values of $|\hat s_n - s|$ and the 100 values of
$|(\hat m_n(\hat s_n+) - \hat m_n(\hat s_n-)) - 0.1|$ (i.e., the absolute bias of the estimated jump
size) are computed, respectively. They are denoted as MAE, SDAE, MABJS,
and SDABJS. The results are presented in Table 3.
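
The four summary measures can be computed from the replicated estimates as in the sketch below. The input values are made-up numbers for illustration, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not state which divisor is used.

```python
import numpy as np

def summarize(s_hats, d_hats, s_true, d_true):
    """MAE/SDAE of |s_hat - s| and MABJS/SDABJS of |d_hat - d| over replications."""
    ae = np.abs(np.asarray(s_hats, dtype=float) - s_true)
    abjs = np.abs(np.asarray(d_hats, dtype=float) - d_true)
    return {
        "MAE": ae.mean(),
        "SDAE": ae.std(ddof=1),
        "MABJS": abjs.mean(),
        "SDABJS": abjs.std(ddof=1),
    }

# Hypothetical estimates from four replications, with s = 0.6 and jump size 0.1.
out = summarize([0.58, 0.62, 0.60, 0.61], [0.08, 0.12, 0.10, 0.11], 0.6, 0.1)
```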

From Table 3, it can be seen that when there is measurement error involved,
(i) the precision of the jump location detected by the proposed jump detection
procedure is better than that of the DKE procedure, across all different
choices of $f_X$, (ii) the proposed procedure reduces the bias of the estimated
jump size, which is consistent with our discussion in Section 3, and (iii)
the proposed procedure has a slightly smaller variability for detecting the
jump location and about the same variability for estimating the jump size,
compared to the DKE procedure.

Fig. 5. The solid line denotes the local linear kernel (LLK) estimate of $p(\cdot)$ in the PHI
example, the dashed line denotes the left-sided estimate of $p(\cdot)$, the dotted line denotes
the right-sided estimate of $p(\cdot)$, the dot-dash line denotes the absolute difference between
the two one-sided estimates of $p(\cdot)$, the long-dash vertical lines denote the 95% confidence
interval for $s$, and the two-dash line denotes the estimated jump location $\hat s_n$.

5. Analysis of the PHI data. In this section, we apply our proposed jump 
detector to the PHI data for evaluating the impact of MLS on the take-up 
rate of PHI. 


Table 3
Numerical comparison of the proposed jump detector NEW with the DKE procedure based
on 100 replicated simulations. MAE and SDAE denote the mean and standard deviation
of the 100 values of $|\hat s_n - s|$. MABJS and SDABJS denote the mean and standard
deviation of the 100 values of $|(\hat m_n(\hat s_n+) - \hat m_n(\hat s_n-)) - 0.1|$

                          DKE                                     NEW
f_X         MAE       SDAE      MABJS     SDABJS      MAE       SDAE      MABJS     SDABJS
Unif[0,1]   0.01718   0.00126   0.02598   0.00064     0.01532   0.00115   0.00511   0.00047
Beta(2,2)   0.01547   0.00109   0.02475   0.00049     0.01329   0.00108   0.00961   0.00044
Beta(3,2)   0.01607   0.00110   0.02593   0.00040     0.01305   0.00092   0.00494   0.00037
Beta(2,3)   0.01810   0.00113   0.02707   0.00055     0.01690   0.00108   0.01031   0.00058

The purposes of introducing PHI in Australia were to give consumers more
choices and take some pressure off the public medical system. However, the 
PHI take-up rate by Australians was very low at the beginning when the PHI 
was first introduced in 1984, and the take-up rate had been declining toward 
the end of 1990s (the take-up rate was only about 31 percent at that time) 
until a series of policies (including the MLS) were introduced. The impacts of
some of these policy measures (e.g., Lifetime Health Cover) have been examined
in a few studies, including Butler (2002), Frech, Hopkins and MacDonald (2003),
Palangkaraya and Yong (2005), and Palangkaraya et al. (2009). But the role
of the MLS has not been identified separately yet. The MLS was imposed in
1997 on high-income taxpayers who did not have private health insurance. Between
1997-1998 and 2007-2008, the threshold of annual taxable income at which
the MLS was payable was $50,000 for singles without children and $100,000 for
couples. For each dependent child in the household, the threshold increased
by $3000. So, taxpayers with children may generate multiple jumps in the
current PHI data. Unfortunately, we do not have information on the number
of children in a family. Also, multiple jump locations within a relatively 
narrow range would be difficult to distinguish, given the measurement error 
involved in the PHI data. To mitigate the effect of multiple jumps due to 
people having children, this paper only focuses on singles in the current PHI 
data. 

The data used here are from a confidentialized “1% Sample Unit Record 
File of Individual Income Tax Returns” for the 2003-2004 financial year, 
that was developed by ATO for research purposes. The file contains just over 
109,000 records of individual tax returns and detailed information on income 
from various sources; different types of tax deductions; taxable income; and 
the take-up of PHI by the individuals. It also contains a limited number of 
demographic variables, including gender, age group, and marital status. In 
this paper, we focus on singles between 20 and 69 years old, who were all 
subject to the same income threshold of $50,000 for the MLS. Therefore, the 
PHI take-up rate is expected to have a jump around that level of the annual 
taxable income. In the tax and transfer system or in the health insurance 
premium regime in Australia, there is no other differential treatment related 
to the PHI take-up. Other demographic covariates (such as gender and age) 
would not generate a discontinuity in the take-up rate, as a function of 
the annual taxable income, either. So, in the current PHI data, MLS seems to be 
the only factor responsible for the jump in the take-up rate. 

As a method of confidentialization, ATO “perturbed” the income variables 
and the deductions, and provided the following information on the 
way the data were perturbed: several random numbers within a specified 
range were generated for each individual and converted into rates 
(with equal probability of being positive or negative), which were then applied 
to the various components of the tax return. These rates were applied to the 
components in a way that tried to maintain relationships between similar items. 
This was achieved by grouping the components into three broad categories: 
work or employment related income and deductions; investment income and 
deductions; and business and other income and deductions. Thus, there is 
some information about the measurement errors in the income data, but 
the actual distribution of the measurement errors cannot be identified 
based on the provided information. The sample was further restricted 
to minimize the number of income sources/deduction sources so that the 
distribution of the error term could be more homogeneous, according to 
the following criteria: (1) Only those who had positive earnings as the only 
sources of income were selected; (2) Individuals whose taxable income was 
not positive (which means their total tax deductions were not less than their 
earnings) were dropped; and (3) We further dropped individuals whose nonwork 
related deductions formed a significant part of their taxable income. 
Specifically, we dropped those individuals whose work related deductions 
were less than 90 percent of earnings when the total deductions were more 
than 10 percent of earnings; whose total deductions were over 50 percent of 
earnings; or whose total deductions were all nonwork related and the total 
deductions were over 10 percent of their earnings. 
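
As a rough illustration of the kind of perturbation described above, the following sketch multiplies each income by one plus a random rate whose sign is positive or negative with equal probability. The uniform distribution of the rate, the bound `max_rate`, and the function name `perturb` are assumptions made only for this example; the range and distribution actually used by ATO were not disclosed.

```python
import numpy as np

def perturb(values, max_rate=0.05, rng=None):
    """Multiply each value by (1 + r), where r = sign * u.

    Assumptions (not from ATO): u ~ Uniform(0, max_rate) and
    sign is +1 or -1 with equal probability.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values, dtype=float)
    magnitude = rng.uniform(0.0, max_rate, size=values.shape)
    sign = rng.choice([-1.0, 1.0], size=values.shape)
    return values * (1.0 + sign * magnitude)

rng = np.random.default_rng(0)
income = rng.uniform(20_000, 80_000, size=5)   # hypothetical true incomes
perturbed = perturb(income, max_rate=0.05, rng=rng)

# On the log scale the multiplicative perturbation becomes an additive error.
log_error = np.log(perturbed) - np.log(income)
print(np.round(log_error, 4))
```

Under these assumptions the error on the log scale is bounded and does not depend on the income level, which is the property the sample restrictions are meant to approximate.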

The final sample for analysis contains 9685 records of individual tax returns. 
By a preliminary analysis, we found that about 26% of singles bought 
PHI in 2003-2004, and the PHI take-up rates for those whose annual taxable 
incomes were below $50,000 and those whose annual taxable incomes were 
above that level were quite different. The PHI take-up rate for the former 
group was about 21% and it was about 57% for the latter group. Because 
ATO perturbed the income data by multiplying each original income observation 
by a random number, we used the income variable in log scale in our 
analysis, so that the additive measurement error assumption in (2) is valid 
here. 
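
To spell out this step, here is a short derivation in illustrative notation, which may differ from the symbols used in (2). Let $X^{*}$ be the true income, $W^{*}$ the released (perturbed) income, and $r$ the random perturbation rate, treated here as a single rate per individual for simplicity.

```latex
\[
W^{*} = X^{*}\,(1 + r)
\quad\Longrightarrow\quad
\log W^{*} = \log X^{*} + \log(1 + r).
\]
% Writing W = \log W^{*}, X = \log X^{*} and U = \log(1 + r) gives
\[
W = X + U ,
\]
% that is, an additive measurement error on the log scale.
```

This is only a sketch: taxable income is earnings minus deductions, and the components may have been perturbed with different rates, which is why the sample was restricted to individuals with a single income source.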

We then use our proposed jump detector (7) to estimate the jump position, 
in which the possible jump location is searched within [10, 11.25] (or, 
equivalently, [$22,026, $76,879] of annual taxable income). The results are 
shown in Figure 5, where the estimated function of p(·) by the local linear 
kernel (LLK) smoothing procedure is shown by the solid line, the left-sided 
and right-sided estimates of p(·) are shown by the dashed and dotted lines, 
respectively, their difference is shown by the dot-dashed line at the bottom 
of the plot, and the jump location estimate and the corresponding 95% 
confidence interval for s are shown by the vertical dot-dash and long-dash 
lines, respectively. In the plot, the related estimates look noisier near the 
right end because there were fewer people who had high incomes. The estimated 
jump location is ŝ = 10.9255 (≈ $55,575). The bandwidth chosen by 
the bootstrap procedure is 0.0792. The 95% confidence interval for s computed 
by the proposed bootstrap procedure is (10.8792, 11.0158) [≈ ($53,061, 
$60,827)], which implies that the true jump location s is significantly larger 
than 10.8198 (≈ $50,000). This finding confirms our intuition that people 
usually act only at income levels somewhat above the one at which they are 
first hit by the MLS. The estimated jump magnitude 
by (2.7) is 0.19. This number shows that the local effect at the MLS 
tax policy discontinuity is quite big. For individuals with only one income 
source, the policy can be considered locally exogenous because the observations 
to the left and right of (but close to) the jump position are more or less 
homogeneous except for the policy treatment. It implies that, among the individuals 
whose annual taxable income is around $55,575, MLS brings about 
an extra 19% of them into the private health system. This also implies a 
negative price elasticity of PHI demand since the jump in the take-up rate 
can be seen as a response to a price discount in the premium. 
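
The idea of comparing left-sided and right-sided local linear estimates can be sketched as follows. This is a naive illustration on simulated data that ignores measurement error altogether; it is not the error-in-variables detector (7). The kernel, the bandwidth `h`, the simulated take-up curve, and the function names `one_sided_ll` and `detect_jump` are all assumptions made for the example.

```python
import numpy as np

def one_sided_ll(x, y, s, h, side):
    """Local linear estimate of E[y | x = s] using data on one side of s only."""
    d = x - s
    if side == "left":
        mask = (d < 0) & (d >= -h)
    else:
        mask = (d >= 0) & (d <= h)
    d, yy = d[mask], y[mask]
    if d.size < 10:                      # too few points for a stable fit
        return np.nan
    w = 1.0 - (d / h) ** 2               # Epanechnikov-type weights
    X = np.column_stack([np.ones_like(d), d])
    WX = X * w[:, None]
    # Weighted least squares; the intercept is the fitted value at s.
    beta = np.linalg.lstsq(X.T @ WX, WX.T @ yy, rcond=None)[0]
    return beta[0]

def detect_jump(x, y, grid, h):
    """Return the grid point maximizing |right - left| and that difference."""
    diffs = np.array([
        one_sided_ll(x, y, s, h, "right") - one_sided_ll(x, y, s, h, "left")
        for s in grid
    ])
    k = np.nanargmax(np.abs(diffs))
    return grid[k], diffs[k]

# Simulated binary take-up data with a jump of 0.3 at log income 10.9.
rng = np.random.default_rng(1)
n = 20_000
x = rng.uniform(10.0, 11.25, n)
p = 0.15 + 0.1 * (x - 10.0) + 0.3 * (x >= 10.9)
y = rng.binomial(1, p)

grid = np.linspace(10.2, 11.05, 86)
s_hat, jump_hat = detect_jump(x, y, grid, h=0.15)
print(s_hat, jump_hat)
```

When the predictor is observed with error, as in the perturbed income data, a naive detector of this kind would be expected to blur the jump, which is the motivation for a detector that accounts for the measurement error.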

6. Concluding remarks. We have proposed a generalized error-in-variables 
jump regression model for describing the relationship between people’s 
annual taxable income and the PHI take-up rate in Australia. A novel jump 
detector is proposed as well, which can accommodate the possible measurement 
errors. A major feature of the proposed method is that it does not 
require much prior knowledge on the measurement error distribution, making 
it applicable in practice. Its performance is evaluated by both numerical 
studies and theoretical justifications. By the proposed method, we found 
that the actual jump in the PHI take-up rate, caused by the MLS tax policy, 
occurred at a larger taxable income value than the threshold value used 
in the policy. 

There is much room for further improvement of the current method. First, 
the proposed jump detection method assumes that there is a single jump 
point at an unknown location. By the framework of jump regression analysis 
[cf. Qiu (2005)], it might be possible to extend it to cases when there are 
multiple jump points and the number of jump points could be either known 
or unknown. Second, this paper focuses on jump detection only. It requires 
much future research to develop an appropriate method to estimate a jump 
regression function from observed data with measurement errors. Third, it 
might be important to extend the current method to higher-dimensional 
cases. 

Acknowledgments. We thank the Editor, the Associate Editor, and the 

SUPPLEMENTARY MATERIAL 

Supplement to “Jump detection in generalized error-in-variables regression 
with an application to Australian health tax policies” 

(DOI: 10.1214/15-AOAS814SUPP; .pdf). This supplemental file mainly gives 
the proof of Theorem 1. 